Redundancy of Lossless Data Compression for Known Sources by Analytic Methods

Authors

Michael Drmota

Wojciech Szpankowski

Abstract

Lossless data compression is a facet of source coding and a well-studied problem of information theory. Its goal is to find a shortest possible binary code that can be unambiguously recovered. In this paper we focus on rigorous analysis of code redundancy for known sources. The redundancy rate problem determines by how much the actual code length exceeds the optimal code length. We present precise analyses of three types of lossless data compression schemes, namely fixed-to-variable (FV) length codes, variable-to-fixed (VF) length codes, and variable-to-variable (VV) length codes. In particular, we investigate the average redundancy of Shannon, Huffman, Tunstall, Khodak, and Boncelet codes. These codes have succinct representations as trees, either as coding or parsing trees, and we analyze here some of their parameters (e.g., the average path from the root to a leaf). Such trees are precisely analyzed by analytic methods, known also as analytic combinatorics, in which complex analysis plays a decisive role. These tools include generating functions, Mellin transform, Fourier series, saddle point method, analytic poissonization and depoissonization, Tauberian theorems, and singularity analysis. We coined the term analytic information theory for problems of information theory studied by analytic tools. This approach lies at the crossroads of information theory, analysis of algorithms, and combinatorics.

M. Drmota and W. Szpankowski. Redundancy of Lossless Data Compression for Known Sources by Analytic Methods. Foundations and Trends® in Communications and Information Theory, vol. XX, no. XX, pp. 1–139, 2016. DOI: 10.1561/XXXXXXXXXX.
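To make the notion of average redundancy in the abstract concrete, the following is a minimal Python sketch, not taken from the paper. It assumes a memoryless source with known symbol probabilities, uses symbol-by-symbol Shannon code lengths ceil(-log2 p), and takes the source entropy as the benchmark for the "optimal code length". The function names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a known probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def shannon_lengths(probs):
    """Shannon code lengths: ceil(-log2 p) for each symbol probability."""
    return [math.ceil(-math.log2(p)) for p in probs]

def kraft_sum(lengths):
    """Kraft sum; a value <= 1 means a prefix code with these lengths exists."""
    return sum(2.0 ** -l for l in lengths)

def average_redundancy(probs, lengths):
    """Average code length minus entropy (assumed benchmark), in bits."""
    avg_len = sum(p * l for p, l in zip(probs, lengths))
    return avg_len - entropy(probs)

# Illustrative distributions (made up for this sketch).
dyadic = [0.5, 0.25, 0.125, 0.125]
skewed = [0.4, 0.3, 0.2, 0.1]

for probs in (dyadic, skewed):
    lengths = shannon_lengths(probs)
    print(lengths, kraft_sum(lengths), average_redundancy(probs, lengths))
```

For the dyadic distribution every probability is a power of two, so the Shannon lengths match -log2 p exactly and the redundancy vanishes; for the other distribution the rounding in ceil leaves a redundancy strictly between 0 and 1 bit per symbol.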

Similar resources

Average Redundancy for Known Sources: Ubiquitous Trees in Source Coding

Analytic information theory aims at studying problems of information theory using analytic techniques of computer science and combinatorics. Following Hadamard’s precept, these problems are tackled by complex analysis methods such as generating functions, Mellin transform, Fourier series, saddle point method, analytic poissonization and depoissonization, and singularity analysis. This approach ...

Image Compression via Intelligent Removal and Coding of Image Information, with Reconstruction Using Image Inpainting Algorithms

Compression can be done by lossy or lossless methods. Lossy methods have been used more widely than lossless compression. Although many methods for image compression have been proposed, methods that use intelligent skipping suited to visual models have not been considered in the literature. Image inpainting refers to the application of sophisticated algorithms to replace lost o...

Generalized Shannon Code Minimizes the Maximal Redundancy

Source coding, also known as data compression, is an area of information theory that deals with the design and performance evaluation of optimal codes for data compression. In 1952 Huffman constructed his optimal code that minimizes the average code length among all prefix codes for known sources. Actually, Huffman codes minimize the average redundancy defined as the difference between the code len...

Lossless Microarray Image Compression by Hardware Array Compactor

Microarray technology is a new and powerful tool for concurrent monitoring of a large number of gene expressions. Each microarray experiment produces hundreds of images, and each digital image requires a large storage space. Hence, real-time processing and transmission of these images necessitate efficient and custom-made lossless compression schemes. In this paper, we offer a new archi...

On the Average Coding Rate of the Tunstall Code for Stationary and Memoryless Sources

The coding rate of a one-shot Tunstall code for stationary and memoryless sources is investigated in non-universal situations so that the probability distribution of the source is known to the encoder and the decoder. When studying the variable-to-fixed length code, the average coding rate has been defined as (i) the codeword length divided by the average block length. We define the average cod...

Journal title:

Foundations and Trends in Communications and Information Theory

Volume 13, issue not listed

Pages: not listed

Publication date: 2017
